Learning to Play Trajectory Games Against Opponents With Unknown Objectives
Abstract
Many autonomous agents, such as intelligent vehicles, are inherently required to interact with one another. Game theory provides a natural mathematical tool for robot motion planning in such interactive settings. However, tractable algorithms for such problems usually rely on a strong assumption, namely that the objectives of all players in the scene are known. To make such tools applicable for ego-centric planning with only local information, we propose an adaptive model-predictive game solver, which jointly infers other players' objectives online and computes a corresponding generalized Nash equilibrium (GNE) strategy. The adaptivity of our approach is enabled by a differentiable trajectory game solver whose gradient signal is used for maximum likelihood estimation (MLE) of opponents' objectives. This differentiability of our pipeline facilitates direct integration with other differentiable elements, such as neural networks (NNs). Furthermore, in contrast to existing solvers for cost inference in games, our method handles not only partial state observations but also general inequality constraints. In two simulated traffic scenarios, we find superior performance of our approach over both existing game-theoretic methods and non-game-theoretic model-predictive control (MPC) approaches. We also demonstrate our approach's real-time planning capabilities and robustness in two-player hardware experiments.
Similar resources
Playing repeated Stackelberg games with unknown opponents
In Stackelberg games, a “leader” player first chooses a mixed strategy to commit to, then a “follower” player responds based on the observed leader strategy. Notable strides have been made in scaling up the algorithms for such games, but the problem of finding optimal leader strategies spanning multiple rounds of the game, with a Bayesian prior over unknown follower preferences, has been left u...
Learning against sequential opponents in repeated stochastic games
This article considers multiagent algorithms that aim to find the best response in strategic interactions by learning about the game and their opponents from observations. In contrast to many state-of-the-art algorithms that assume repeated interaction with a fixed set of opponents (or even self-play), a learner in the real world is more likely to encounter the same strategic situation with cha...
Learning against opponents with bounded memory
Recently, a number of authors have proposed criteria for evaluating learning algorithms in multiagent systems. While well-justified, each of these has generally given little attention to one of the main challenges of a multi-agent setting: the capability of the other agents to adapt and learn as well. We propose extending existing criteria to apply to a class of adaptive opponents with bounded ...
Using Transfer Learning to Model Unknown Opponents in Automated Negotiations
Modeling unknown opponents is known as a key factor for the efficiency of automated negotiations. The learning processes are however challenging because of (1) the indirect way the target function can be observed, and (2) the limited amount of experience available to learn from an unknown opponent at a single session. To address these difficulties we propose to adopt two approaches from transfe...
How do people play against Nash opponents in games which have a mixed strategy equilibrium?
We examine experimentally how humans behave when they, unbeknownst to them, play against a computer which implements its part of a mixed strategy Nash equilibrium. We consider two games, one zero-sum and another unprofitable with a pure minimax strategy. A minority of subjects’ play was consistent with their Nash equilibrium strategy. But a larger percentage of subjects’ play was more consisten...
Journal
Journal title: IEEE Robotics and Automation Letters
Year: 2023
ISSN: 2377-3766
DOI: https://doi.org/10.1109/lra.2023.3280809